As second-generation gravitational-wave detectors prepare to analyze data at unprecedented
sensitivity, there is great interest in searches for unmodeled transients, commonly called bursts.
Significant effort has yielded a variety of techniques to identify and characterize such transient
signals, and many of these methods have been applied to produce astrophysical results using data
from first-generation detectors. However, the computational cost of background estimation remains
a challenging problem; it is difficult to claim a 5σ detection with reasonable computational
resources without paying for efficiency with reduced sensitivity. We demonstrate a hierarchical
approach to gravitational-wave transient detection, focusing on long-lived signals, which can be
used to detect transients with significance in excess of 5σ using modest computational resources.
In particular, we show how previously developed seedless clustering techniques can be applied to
large datasets to identify high-significance candidates without having to trade sensitivity for
speed.


Introduction. With second-generation gravitational-wave (GW) detectors coming online later
this year, the first direct detection of GWs may be near. In order to establish the significance
of a detection, it is common to report a false alarm probability (FAP), which quantifies the
probability that a noise fluctuation could produce an event at least as loud as the observed
candidate (as measured by some detection statistic). In some subfields, e.g., particle physics,
“5σ significance,” corresponding to $\mathrm{FAP} \approx 2.9 \times 10^{-7}$, is used as a
detection threshold.

In order to estimate the FAP of a GW candidate, it is common to perform time-shifts in which
the GW strain time series from one detector is shifted with respect to the series from a second
detector by an amount greater than both the travel time between the detectors and the coherence
time of the signals being targeted. Time-shifting preserves non-Gaussian and non-stationary
features that characterize the zero-lag (no time-shift) noise, while simultaneously eliminating
true GW signals. By performing $N$ time-shifts, it is possible to generate a distribution of the
detection statistic, which can be used to estimate FAP to a level of $\gtrsim 1/N$. The 5σ
threshold corresponds to $N \approx 3.5 \times 10^{6}$. In many cases it is computationally
impractical to carry out this many time-shifts, though it has been accomplished in the
“detection” of a LIGO blind injection with a matched filter search [1]. Despite the pervasive use
of time-shifts, the technique has limitations [2].
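The relation between a significance level, the corresponding FAP, and the number of time-shifts
needed to resolve it can be checked with a short calculation. The sketch below assumes a
one-sided Gaussian tail for the conversion from σ to FAP; the function names are illustrative
and not taken from any analysis code.

```python
import math


def one_sided_fap(n_sigma):
    """One-sided Gaussian tail probability for an n-sigma threshold (assumed convention)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def min_time_shifts(fap):
    """Smallest N for which the resolvable FAP floor, 1/N, reaches `fap`."""
    return math.ceil(1.0 / fap)


def empirical_fap(background, observed):
    """Fraction of time-shifted statistics that are at least as loud as `observed`."""
    louder = sum(1 for value in background if value >= observed)
    return louder / len(background)


fap_5sigma = one_sided_fap(5.0)        # about 2.9e-7
n_shifts = min_time_shifts(fap_5sigma)  # about 3.5e6
print(f"FAP(5 sigma) = {fap_5sigma:.2e}, time-shifts needed = {n_shifts:.2e}")
```

With these conventions the 5σ tail probability and the required number of time-shifts agree with
the values quoted in the text.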

For many transient GW searches, the significant computing costs incurred by background
estimation arise from a desire to use a coherent detection statistic. Coherent algorithms utilize
the complex-valued cross-power obtained by cross-correlating strain data from $\geq 2$ detectors
instead of, or in addition to, the incoherent auto-power observed in each detector separately;
see, e.g., [3-6]. The extra phase information helps differentiate signal from background,
improving the sensitivity of the search. However, the cost of background estimation for coherent
searches is large compared to that of a comparable incoherent search because the detection
statistic must be recalculated for each time-shift, after the fresh application of a clustering
algorithm. Some algorithms use single-detector auto-power exclusively, which allows for much more
rapid background estimation [7, 8].

In this Letter, we describe a hierarchical approach to background estimation in the context of
a search for long-lived, unmodeled GW transients using seedless clustering. First, we identify
“events” using a computationally intensive, but incoherent, single-detector statistic. Second, we
calculate a computationally fast, coherent detection statistic for each event identified with the
single-detector statistic. The second, coherent detection statistic is used to evaluate
significance. By splitting the calculation into an incoherent stage and a coherent stage, the
computationally intensive calculations are carried out just once, allowing rapid background
estimation without sacrificing the sensitivity gained by the use of coherence.

We demonstrate this technique by estimating the background past the 5σ level for one week of
simulated Monte Carlo data and one week of Initial LIGO noise, recolored to resemble data from
Advanced LIGO. We calculate the sensitivity of this mock search for several toy model waveforms
and find that it is not adversely affected by the incoherent stage. The remainder of this Letter
is organized as follows. First, we review the basics of transient identification with seedless
clustering and describe the details of the new hierarchical detection statistic. Next, we
describe a mock data analysis carried out on one week of Monte Carlo data and one week of
recolored Initial LIGO noise. Finally, we present results demonstrating the ability to estimate
background at the 5σ level.

Method. In previous work, we have described seedless clustering, a technique in which GW
transients are identified by looking for clusters of excess coherence integrated along many
different parametrized curves through frequency-time space. In particular, cubic Bezier curves
[15] provide a useful family of curves suitable for the detection of bursting sources lasting
tens of seconds to weeks, which slowly evolve in frequency, but which are approximately
narrowband on short time scales. Such long-lived signals, created, e.g., by rotational
instabilities in nascent neutron stars [16-19] or by the fragmentation of an accretion disk
[20-22], are potentially detectable by second-generation detectors.

Previous applications of seedless clustering have looked for clusters of excess coherence using
a coherent statistic:

$$\rho(t;f) = \frac{1}{\mathcal{N}} \, \frac{\tilde{s}_I^*(t;f) \, \tilde{s}_J(t;f)}{\sqrt{P'_I(t;f) \, P'_J(t;f)}}. \qquad (1)$$

Here, $\tilde{s}_I(t;f)$ is the Fourier transform of the strain time series in detector $I$ for a
data segment centered on time $t$ with frequencies $f$. $P'_I(t;f)$ is the auto-power spectrum
calculated using (typically nine) neighboring segments while $\mathcal{N}$ is a Fourier
normalization constant. By integrating $\rho(t;f)$ along suitable curves in frequency-time space
(and applying a phase factor to “point” in different sky directions), one can define a
signal-to-noise ratio for the cluster [25]

$$\mathrm{SNR}_\mathrm{tot} = \frac{1}{N^{1/2}} \sum_{\{t;f\} \in \Gamma} \mathrm{Re}\left[ e^{2\pi i f \Delta\tau} \rho(t;f) \right]. \qquad (2)$$

Here, $\Gamma$ is a template describing the set of spectrogram pixels in the parametrized
(Bezier) curve, $N$ is the number of pixels in the curve, and $\Delta\tau$ is the
direction-dependent time delay between two detectors.

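By calculating $\mathrm{SNR}_\mathrm{tot}$ for many different templates $\Gamma$ and for many
different time delays $\Delta\tau$, it is possible to find very weak signals buried in noise
[10]. The loudest event in each $\rho(t;f)$ spectrogram is deemed to have a signal-to-noise
ratio:

$$\mathrm{SNR}_\mathrm{tot}^\mathrm{max} = \max_{\Gamma,\,\Delta\tau} \left[ \mathrm{SNR}_\mathrm{tot} \right]. \qquad (3)$$

$\mathrm{SNR}_\mathrm{tot}^\mathrm{max}$ is a coherent detection statistic. The observed value of
$\mathrm{SNR}_\mathrm{tot}^\mathrm{max}$ is compared to a distribution of
$\mathrm{SNR}_\mathrm{tot}^\mathrm{max}$ generated using time-shifts in order to assign a
false-alarm probability.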
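A minimal NumPy sketch of the coherent statistic may help fix the notation. It assumes the
spectrogram is stored as an array indexed `[time, frequency]`, that a template is a triple of
pixel index arrays plus a trial delay, and that the normalization constant is passed in as
`norm`; all of these choices, and the function names, are illustrative assumptions rather than
the authors' implementation.

```python
import numpy as np


def coherence_map(s_i, s_j, p_i, p_j, norm=1.0):
    """Complex cross-power divided by the neighboring-segment auto-power of both detectors."""
    return np.conj(s_i) * s_j / (norm * np.sqrt(p_i * p_j))


def snr_tot(rho, t_idx, f_idx, freqs, delay):
    """Phase-rotated sum of rho over the template pixels, divided by sqrt(N)."""
    phase = np.exp(2j * np.pi * freqs[f_idx] * delay)
    return np.sum(np.real(phase * rho[t_idx, f_idx])) / np.sqrt(len(t_idx))


def loudest_template(rho, templates, freqs):
    """Largest SNR_tot over a list of (t_idx, f_idx, delay) templates."""
    return max(snr_tot(rho, t, f, freqs, d) for t, f, d in templates)


# Toy usage: a 4-pixel template through a map with unit coherence everywhere.
freqs = np.array([100.0, 101.0, 102.0])
rho = np.ones((4, 3), dtype=complex)
template = (np.array([0, 1, 2, 3]), np.array([0, 0, 1, 1]), 0.0)
print(loudest_template(rho, [template], freqs))
```

Because the sum is divided by the square root of the pixel count, a template of $N$ pixels with
unit coherence in every pixel yields $\sqrt{N}$.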

To reap the benefits of seedless clustering, it is desirable to employ a large number of
templates: $\mathcal{O}(10^{7})$ for a single ~300 s spectrogram. The calculation can be sped up
by parallelization, but it becomes prohibitive to repeat it $> 10^{6}$ times on time-shifted
data, the number of realizations required to carry out 5σ background estimation. In order to
eliminate this computational bottleneck, we introduce a single-detector, incoherent statistic

$$\rho_I(t;f) = \frac{1}{\mathcal{N}} \, \frac{|\tilde{s}_I(t;f)|^2}{P'_I(t;f)}. \qquad (4)$$

Note that $\rho_I(t;f)$ is equivalent to $\rho(t;f)$ for the case where the two detector indices
are equal, $I = J$. It is the ratio of the auto-power in detector $I$ at time $t$ to the
auto-power in detector $I$ at the times neighboring $t$.

While the seedless clustering algorithms described above were developed in the context of a
coherent search using $\rho(t;f)$, it is straightforward to apply them to the new incoherent
statistic $\rho_I(t;f)$ by simply defining a new single-detector signal-to-noise ratio:

$$\mathrm{SNR}_\Gamma^{(I)} = \frac{1}{N^{1/2}} \sum_{\{t;f\} \in \Gamma} \rho_I(t;f). \qquad (5)$$

The phase factor from Eq. (2) drops out because the time delay between two (now identical)
detectors is zero. As before, we identify the most significant cluster in each spectrogram. It is
necessary to do this separately for each detector:
$\mathrm{SNR}_\mathrm{max}^{(I)} = \max_\Gamma \mathrm{SNR}_\Gamma^{(I)}$.

While $\rho_I(t;f)$ and $\rho(t;f)$ are similarly defined, it is important to note differences
in their statistical behavior. If no signal is present, the distribution of
$\mathrm{Re}[\rho(t;f)]$ is peaked symmetrically about zero. However, the distribution of
$\rho_I(t;f)$ is positive definite and asymmetric with a peak near one. Thus,
$\mathrm{SNR}_\mathrm{tot}^\mathrm{max}$ [calculated from $\rho(t;f)$] and
$\mathrm{SNR}_\mathrm{max}^{(I)}$ [calculated from $\rho_I(t;f)$] have very different
distributions. While $\mathrm{SNR}_\mathrm{tot}^\mathrm{max} = 10$ corresponds to a highly
significant event, $\mathrm{SNR}_\mathrm{max}^{(I)} = 29$ is typical for noise.
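The different behavior of the two statistics in noise can be illustrated with simulated data.
The sketch below builds the single-detector map as auto-power divided by the mean auto-power of
adjacent time bins, and compares it with the real part of the cross-power for two independent
noise streams. The number of neighbors on each side, the edge handling, and the use of
unit-variance complex Gaussian noise are assumptions made only for this illustration.

```python
import numpy as np


def neighbor_average(power, n_each_side=4):
    """Mean power of the time bins adjacent to each bin, excluding the bin itself."""
    n_t = power.shape[0]
    out = np.empty_like(power, dtype=float)
    for t in range(n_t):
        lo, hi = max(0, t - n_each_side), min(n_t, t + n_each_side + 1)
        idx = [k for k in range(lo, hi) if k != t]
        out[t] = power[idx].mean(axis=0)
    return out


def single_detector_map(s_i, norm=1.0):
    """Auto-power at time t relative to the auto-power at neighboring times."""
    power = np.abs(s_i) ** 2
    return power / (norm * neighbor_average(power))


def single_detector_snr(rho_i, t_idx, f_idx):
    """Sum of the single-detector map over template pixels, divided by sqrt(N)."""
    return np.sum(rho_i[t_idx, f_idx]) / np.sqrt(len(t_idx))


rng = np.random.default_rng(0)
shape = (288, 200)  # [time, frequency]
s1 = (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(2)
s2 = (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(2)

rho_1 = single_detector_map(s1)
p1, p2 = neighbor_average(np.abs(s1) ** 2), neighbor_average(np.abs(s2) ** 2)
rho_cross = np.conj(s1) * s2 / np.sqrt(p1 * p2)

print("single-detector map: min %.3f, mean %.3f" % (rho_1.min(), rho_1.mean()))
print("Re[cross-power]:     mean %.4f" % rho_cross.real.mean())
```

In this toy setup the single-detector map is strictly positive with a mean of order one, while
the real part of the cross-power averages to approximately zero, which is why a sum over many
pixels is large for the former even when no signal is present.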

If we stopped here and used $\mathrm{SNR}_\mathrm{max}^{(I)}$ as a detection statistic, the
method would rely on an incoherent statistic. However, we can do much better by using clusters
identified with $\mathrm{SNR}_\mathrm{max}^{(I)}$ to calculate a coherent statistic. The loudest
cluster as ranked by the single-detector statistic is denoted $\Gamma_I$. The coherent
signal-to-noise ratio for this cluster is:

$$\Lambda^{(I)} = \frac{1}{N^{1/2}} \max_{\Delta\tau} \sum_{\{t;f\} \in \Gamma_I} \mathrm{Re}\left[ e^{2\pi i f \Delta\tau} \rho(t;f) \right]. \qquad (6)$$

The $\max_{\Delta\tau}$ term indicates that the sum is carried out for 400 evenly spaced values
of $\Delta\tau$, sufficient to match the diffraction-limited resolution, and that the largest
result is kept. $\Lambda^{(I)}$ represents the coherent signal-to-noise ratio in detectors $I$
and $J$ for the loudest cluster identified using only the auto-power from detector $I$. Last, we
define the detection statistic as the maximum value of $\Lambda^{(I)}$ among the two detectors:
$\Lambda = \max_I \Lambda^{(I)}$.
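The scan over time delays for a single cluster is cheap because it involves only the pixels of
that one cluster. A vectorized sketch follows; the delay range of ±10 ms (roughly the light
travel time between Hanford and Livingston) is an assumption made here, as are the array layout
`[time, frequency]` and the function names.

```python
import numpy as np


def coherent_lambda(rho, t_idx, f_idx, freqs, delays):
    """Coherent SNR of one cluster, maximized over the trial time delays."""
    pixels = rho[t_idx, f_idx]                                    # shape (N,)
    phases = np.exp(2j * np.pi * np.outer(delays, freqs[f_idx]))  # shape (n_delays, N)
    sums = np.real(phases * pixels).sum(axis=1) / np.sqrt(pixels.size)
    return sums.max()


def detection_statistic(rho, clusters, freqs, delays):
    """Largest coherent SNR among the clusters found in each detector.

    `clusters` maps a detector label to the (t_idx, f_idx) pixels of its loudest cluster.
    """
    return max(coherent_lambda(rho, t, f, freqs, delays) for t, f in clusters.values())


# Toy usage: a cluster whose phase matches one of the 400 trial delays exactly.
freqs = 100.0 + np.arange(50)
delays = np.linspace(-0.010, 0.010, 400)   # assumed range, in seconds
rho = np.exp(-2j * np.pi * freqs * delays[123])[None, :] * np.ones((20, 1))
cluster = (np.arange(20), np.arange(20))
print(detection_statistic(rho, {"H1": cluster, "L1": cluster}, freqs, delays))
```

When the trial delay matches the phase evolution of the cluster, every pixel contributes its full
magnitude and the statistic reaches $\sqrt{N}$ for unit-magnitude pixels.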

To illustrate why this hierarchical design is useful, it is helpful to describe the procedure
of background estimation as a series of numbered steps.

1. We break the coincident data into conveniently long spectrograms to be analyzed for
clusters. In this Letter, we use 50%-overlapping, ≈300 s spectrograms.

2. For each spectrogram, we identify the loudest cluster $\Gamma_I$ in each detector using the
single-detector, incoherent statistic $\mathrm{SNR}_\mathrm{max}^{(I)}$.

3. If $\mathrm{SNR}_\mathrm{max}^{(I)}$ is less than some pre-determined threshold
$\mathrm{SNR}_\mathrm{th}$, proceed no further. The cluster is not promising enough to spend
time calculating the coherent statistic.

4. For each spectrogram, if there is a cluster passing the cut
$\mathrm{SNR}_\mathrm{max}^{(I)} \geq \mathrm{SNR}_\mathrm{th}$, we calculate the coherent
detection statistic $\Lambda$.

5. Take the spectrogram data produced in step 1 and time-shift the clusters in one detector to
create a new noise realization. Repeat steps 2-4 to generate a background distribution for
$\Lambda$.

Since steps 2-3 use a single-detector statistic, we obtain the same list of single-detector
clusters every time; it does not matter if the data streams are shifted with respect to one
another. This means that we can carry out steps 2-3 just once and reuse the results for
subsequent time-slides. This is important because step 2 (cluster identification) is the
computationally expensive step. We still need to recalculate the coherent statistic $\Lambda$
for each time-slide, but this is a cheap calculation since we have reduced the number of
templates from $\mathcal{O}(10^{7})$ per spectrogram to one (and only a fraction of spectrograms
will contain a cluster passing the step 3 cut). The zero-lag data are analyzed identically with
the same hierarchical process, ensuring that the background estimation can be used to identify
detections.
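The structure of the computation (an expensive single-detector stage executed once, followed by
a cheap coherent stage repeated for every time-shift) can be sketched as follows. This is a toy
illustration under several assumptions: the data are unit-variance complex Gaussian noise
standing in for whitened spectrograms, templates are constant-frequency lines rather than Bezier
curves, time-shifts are circular shifts by whole spectrograms, and only clusters found in
detector I are followed up (the full statistic also repeats the procedure with the detector
roles exchanged and keeps the larger value). All names are hypothetical.

```python
import numpy as np


def loudest_cluster(auto_map, templates):
    """Step 2: return (snr, template) for the loudest template in one single-detector map."""
    snrs = [auto_map[t, f].sum() / np.sqrt(len(t)) for t, f in templates]
    best = int(np.argmax(snrs))
    return snrs[best], templates[best]


def incoherent_stage(white_i, templates, threshold):
    """Steps 2-3, run once: spectrogram index -> cluster that passed the cut."""
    survivors = {}
    for k, w in enumerate(white_i):
        snr, tpl = loudest_cluster(np.abs(w) ** 2, templates)
        if snr >= threshold:
            survivors[k] = tpl
    return survivors


def coherent_stage(white_i, white_j, survivors, freqs, delays, shift):
    """Step 4 for one time-shift: coherent statistic for the surviving clusters only."""
    n_spec = len(white_j)
    values = []
    for k, (t_idx, f_idx) in survivors.items():
        partner = white_j[(k + shift) % n_spec]  # detector J, shifted by whole spectrograms
        rho = np.conj(white_i[k][t_idx, f_idx]) * partner[t_idx, f_idx]
        phases = np.exp(2j * np.pi * np.outer(delays, freqs[f_idx]))
        sums = np.real(phases * rho).sum(axis=1) / np.sqrt(len(t_idx))
        values.append(sums.max())
    return values


rng = np.random.default_rng(1)
n_spec, n_t, n_f = 20, 30, 16
shape = (n_spec, n_t, n_f)
white_i = (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(2)
white_j = (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(2)
freqs = 100.0 + np.arange(n_f)
delays = np.linspace(-0.01, 0.01, 400)
templates = [(np.arange(n_t), np.full(n_t, f)) for f in range(n_f)]

survivors = incoherent_stage(white_i, templates, threshold=7.5)  # expensive, done once
background = []
for shift in range(1, n_spec):                                   # cheap, repeated
    background += coherent_stage(white_i, white_j, survivors, freqs, delays, shift)
print(len(survivors), "surviving clusters,", len(background), "background samples")
```

The template search appears only in `incoherent_stage`, so its cost does not grow with the
number of time-shifts; each additional shift costs one delay scan per surviving cluster.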

Mock data analysis. We carry out two mock data analyses. For the first analysis, we analyze
Monte Carlo noise. The simulated noise is Gaussian, stationary in time, and colored according to
the design sensitivity of Advanced LIGO. For the second analysis, we analyze time-shifted data
from Initial LIGO, which has been recolored to resemble Advanced LIGO noise at design
sensitivity. The recolored noise preserves non-Gaussian noise artifacts that are present in real
data, but not in Monte Carlo simulations. Of course, the character of the non-Gaussian noise
present in Advanced LIGO noise is likely to be different from that of Initial LIGO as the
detectors are significantly different. The recolored noise results should therefore be taken as
a plausible approximation of Advanced LIGO noise. In both cases, we use a two-detector network
consisting of the LIGO Hanford and Livingston observatories.

In order to estimate the background for one week of data, we analyze 16.7 days of data,
corresponding to $10^{4}$ 288 s-long, 50%-overlapping spectrograms. The spectrograms span a
frequency range of 100-1800 Hz and are composed of 1 Hz × 1 s pixels, overlapping 50% in time.
This choice of spectrographic resolution is suitable for many long-lived transient waveforms
with slowly varying frequency.

waveform   duration (s)   f_min-f_max (Hz)   median strain
ADI 1      39             130-170            $4 \times 10^{-24}$
ADI 2      230            110-260            $2 \times 10^{-23}$
FA 2       200            790-1080           $2 \times 10^{-22}$

TABLE I: Summary of the waveforms used in this study; see the original waveform references for
additional details. Accretion disk instability (ADI) waveforms are down-chirping while fallback
accretion (FA) waveforms are up-chirping. The ADI waveforms have been normalized so that the
isotropic equivalent energy is $0.1\,M_\odot$. The median strain is for face-on systems at
100 Mpc.

In order to impose the constraints of a realistic search, we assume that the seedless
clustering parameters are tuned just once. We employ cubic Bezier curves with a minimum duration
of 40 s. For each spectrogram, we employ $10^{7}$ templates to find the loudest auto-power
cluster $\Gamma_I$ for each detector. The difference in frequency between the first and third
Bezier control points determines the extent to which the signal frequency may vary in time. We
limit this variation so that the frequency of the third control point is within ±50% of the
frequency of the first control point.
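A sketch of how random templates obeying these constraints might be drawn and converted to
spectrogram pixels is given below. Because the text refers to a first and a third control point,
the sketch draws three control points; the evaluator uses de Casteljau's algorithm and accepts
any number of control points, so a four-point cubic curve works unchanged. The uniform sampling
distributions, the 0.5 s time step (1 s pixels overlapping 50%), and all function names are
assumptions made for illustration.

```python
import numpy as np


def bezier_curve(control_points, n_samples=1000):
    """Evaluate a Bezier curve of any order with de Casteljau's algorithm."""
    pts = np.asarray(control_points, dtype=float)        # shape (n_ctrl, 2): (time, freq)
    u = np.linspace(0.0, 1.0, n_samples)[:, None, None]
    level = np.broadcast_to(pts, (n_samples,) + pts.shape)
    while level.shape[1] > 1:
        level = (1.0 - u) * level[:, :-1] + u * level[:, 1:]
    return level[:, 0]                                   # shape (n_samples, 2)


def curve_to_pixels(curve, dt=0.5, df=1.0, t0=0.0, f0=100.0):
    """Map curve samples to unique (time index, frequency index) pixels."""
    t_idx = np.floor((curve[:, 0] - t0) / dt).astype(int)
    f_idx = np.floor((curve[:, 1] - f0) / df).astype(int)
    pixels = np.unique(np.stack([t_idx, f_idx], axis=1), axis=0)
    return pixels[:, 0], pixels[:, 1]


def random_template(rng, duration=288.0, f_lo=100.0, f_hi=1800.0,
                    min_dur=40.0, max_frac=0.5):
    """Draw control points with duration >= min_dur and end frequency within
    ±max_frac of the start frequency."""
    t_start = rng.uniform(0.0, duration - min_dur)
    t_end = rng.uniform(t_start + min_dur, duration)
    f_start = rng.uniform(f_lo, f_hi)
    f_end = np.clip(f_start * (1.0 + rng.uniform(-max_frac, max_frac)), f_lo, f_hi)
    t_mid = rng.uniform(t_start, t_end)
    f_mid = rng.uniform(min(f_start, f_end), max(f_start, f_end))
    return [(t_start, f_start), (t_mid, f_mid), (t_end, f_end)]


rng = np.random.default_rng(0)
ctrl = random_template(rng)
t_idx, f_idx = curve_to_pixels(bezier_curve(ctrl))
print(len(t_idx), "pixels in template")
```

Drawing the middle control point between the end points keeps the track monotonic in frequency,
which is a simplification; a search could allow a wider range for that point.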

We consider three waveforms, which we used to benchmark previous studies using seedless
clustering. In particular, we employ two down-chirping accretion disk instability (ADI) signals
from [24] and one up-chirping fallback accretion (FA) signal. Following previous work, the
accretion disk instability signals are scaled so that the isotropic equivalent energy is
$0.1\,M_\odot$. The spectrographic properties of the signals are described in Table I. We follow
the naming conventions adopted in those earlier studies.

We do not include the results for a fourth waveform, FA 1, which persists for 20 s and evolves
from 790-1080 Hz. The implementation described above performed poorly in detecting this waveform
because it is relatively short and it evolves rapidly in frequency. It is possible that we could
effectively detect signals of this type using a different tuning from the one we describe above,
but we seek to apply a uniform set of parameters, which are optimized for longer,
less-quickly-evolving signals.

Above, we listed five steps necessary for carrying out the hierarchical search. The blue curve
in Fig. 1a shows the FAP associated with the single-detector statistic
$\mathrm{SNR}_\mathrm{max}^{(I)}$ obtained after steps 1-2 with recolored noise. FAP is defined
for a single 288 s spectrogram, so $\mathrm{FAP} = 10^{-2}$ corresponds to a false alarm rate of
$\mathrm{FAR} = (1/8)\,\mathrm{hr}^{-1}$. The vertical lines in Fig. 1a show the median values of
the single-detector statistic $\mathrm{SNR}_\mathrm{max}^{(I)}$ for data that include an ADI 2
signal injected face-on at an optimal sky location. The different colors indicate different
source distances.
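The conversion between a per-spectrogram FAP and a false alarm rate is simple arithmetic, shown
here as a small check. The helper name is hypothetical, and the conversion divides by the full
288 s spectrogram duration without accounting for the 50% overlap, which is the convention that
reproduces the number quoted above.

```python
def fap_to_far_per_hour(fap, segment_seconds=288.0):
    """False alarm rate (per hour) for a FAP defined per spectrogram of given duration."""
    return fap / segment_seconds * 3600.0


far = fap_to_far_per_hour(1e-2)
print(f"FAR = {far:.3f} per hour")  # one false alarm every 8 hours
```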

FIG. 1: Left (a): FAP versus the single-detector statistic $\mathrm{SNR}_\mathrm{max}^{(I)}$.
The blue curve indicates the distribution of recolored noise for Advanced LIGO at design
sensitivity. The vertical lines show the median values for data including an accretion disk
instability signal (ADI 2; face-on, optimal orientation) with different colors corresponding to
different distances. The dashed black lines show which data are removed via the single-detector
cut. Right (b): FAP versus the coherent statistic $\Lambda$. The blue curve indicates the
distribution of recolored noise for Advanced LIGO (Hanford and Livingston) at design sensitivity.
The vertical lines show the median values for data including an accretion disk instability signal
(ADI 2; face-on, optimal orientation) with different colors corresponding to different distances.
The dashed black lines indicate the threshold required for 3σ, 4σ, and 5σ detection in one week
of data.

For step 3, it is necessary to define a threshold $\mathrm{SNR}_\mathrm{th}$ for the
single-detector statistic. This threshold is applied in order to control the computational
costs. By throwing out below-threshold candidates, we reduce the number of surviving clusters to
include in subsequent steps. The choice of $\mathrm{SNR}_\mathrm{th}$ is a balancing act.
Increasing the threshold reduces the computational burden, but it can also reduce sensitivity by
inadvertently discarding true signals. For the Monte Carlo analysis described here, we find that
$\mathrm{SNR}_\mathrm{th} = 29.4$ eliminates 99% of background triggers (thereby reducing the
computational burden of steps 4-5 a hundredfold) without severely harming sensitivity. The step
3 cut is indicated by the dashed black line in Fig. 1a. Note that (face-on, optimally oriented)
signals with distance <260 Mpc are likely to survive the cut. For recolored noise, the threshold
is selected to be $\mathrm{SNR}_\mathrm{th} = 30.6$.

Having eliminated 99% of the clusters, we proceed to calculate $\Lambda$. In Fig. 1b, we plot
FAP as a function of $\Lambda$ for recolored noise. The different colored vertical lines
indicate the median value of $\Lambda$ for injected signals (face-on, optimal orientation). If
an injected signal does not pass the step 3 auto-power cut in at least one detector, it is
assigned a value of $\Lambda = 0$. The blue curve starts at $\mathrm{FAP} \approx 10^{-2}$
because 99% of the clusters are removed by the auto-power cut. The horizontal dashed lines
indicate the required significance for 3σ, 4σ, and 5σ detections in one week of data. Recall
that FAP is measured in reference to a 288 s spectrogram. One week corresponds to 4800
(50%-overlapping) spectrograms and so 5σ corresponds to $\mathrm{FAP} = 7 \times 10^{-11}$.

From Fig. 1b, it is apparent that the <260 Mpc signals that we avoided cutting in step 3
(Fig. 1a) are detectable with the coherent $\Lambda$ statistic at the >4σ level. From Figs. 1a
and 1b, we see why the hierarchical approach is superior to the single-detector statistic. The
coherent statistic yields a FAP which is many orders of magnitude smaller. Next, we consider how
the sensitivity is affected by the use of an incoherent statistic as an intermediate step. We
compare $\Lambda$ to $\mathrm{SNR}_\mathrm{tot}^\mathrm{max}$ for mock signals that are loud
enough to detect with the hierarchical scheme at 5σ and find that they are comparable to within
37%. Somewhat surprisingly, the hierarchical scheme is actually slightly more sensitive in this
regime because it searches the sky in a systematic way whereas the
$\mathrm{SNR}_\mathrm{tot}^\mathrm{max}$ statistic pairs each track with a random sky position
guess.

The initial clustering procedure (corresponding to steps 1-3) is carried out on Kepler GK104s
Graphics Processor Units (GPUs) at a cost of 1.2 continuously running GPUs. We find that 18 Intel
Xeon E5-2670 Central Processing Unit (CPU) cores are required to match the performance of one
GPU. Steps 4-5, in which we time-shift the data in order to calculate the coherent statistic
$\Lambda$, are performed using single CPU cores at a cost of 5.8 continuously running cores in
order to achieve the background estimation necessary to identify 5σ detection candidates. The
calculation in steps 4-5 does not, at present, benefit dramatically from GPU acceleration. To
put this in perspective, a fully coherent search, calculating $\mathrm{SNR}_\mathrm{tot}$ for
each time-slide with seedless clustering, would require $2.2 \times 10^{6}$ continuously running
GPUs.

In Table II we summarize the results of a sensitivity study in which we estimate the 5σ
detection distance: the distance at which we can detect a waveform with a FAP corresponding to
5σ with a false dismissal probability (FDP) of 50%. We consider the cases of face-on, optimally
oriented sources (useful for comparison with previous work) and also randomly oriented sources
with random sky locations. The first column is the waveform, the second is the type of noise
(MC = Monte Carlo or RC = recolored noise), and the third is the 5σ detection distance. The RC
detection distances are sometimes smaller than the MC distances because non-stationary noise
artifacts present in real data tend to decrease the sensitivity compared to idealized Gaussian
noise.


Waveform        Noise   5σ distance (Mpc)
ADI 1 optimal   MC      250
ADI 1 optimal   RC      250
ADI 1 random    MC      150
ADI 1 random    RC      127
ADI 2 optimal   MC      232
ADI 2 optimal   RC      232
ADI 2 random    MC      139
ADI 2 random    RC      130
FA 2 optimal    MC      30
FA 2 optimal    RC      25
FA 2 random     MC      18
FA 2 random     RC      14


TABLE II: Five sigma detection distances (FDP = 50%) for the three different test waveforms
summarized in Table I in Gaussian Monte Carlo noise (MC) and Initial LIGO noise, recolored to
match Advanced LIGO at design sensitivity (RC). We assume Hanford and Livingston detectors
operating at design sensitivity. “Optimal” means that the source is face-on and optimally
oriented to maximize detectability. “Random” means that the orientation and sky location of the
source are chosen from random distributions.
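One way to turn injection results into a detection distance at a given FDP is to measure the
detection efficiency at each injected distance and interpolate to the target efficiency. The
sketch below assumes linear interpolation between neighboring distances and a detection
criterion of the statistic exceeding a fixed threshold; it illustrates the definition and is not
a description of the procedure used to produce Table II. All names and the toy numbers are
hypothetical.

```python
import numpy as np


def detection_distance(distances, stat_samples, threshold, fdp=0.5):
    """Distance at which a fraction (1 - fdp) of injections exceeds `threshold`.

    `stat_samples[k]` holds the detection statistic for injections at `distances[k]`.
    Returns None if the efficiency never crosses the target within the sampled range.
    """
    d = np.asarray(distances, dtype=float)
    eff = np.array([np.mean(np.asarray(s) > threshold) for s in stat_samples])
    order = np.argsort(d)
    d, eff = d[order], eff[order]
    target = 1.0 - fdp
    for k in range(len(d) - 1):
        if eff[k] >= target > eff[k + 1]:
            frac = (eff[k] - target) / (eff[k] - eff[k + 1])
            return d[k] + frac * (d[k + 1] - d[k])
    return None


# Toy usage: efficiencies of 100%, 75%, and 25% at 100, 200, and 300 Mpc.
samples = [[10, 10, 10, 10], [10, 10, 10, 1], [10, 1, 1, 1]]
print(detection_distance([100, 200, 300], samples, threshold=5.0))
```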

In order to put Table II in context, we compare the face-on, optimal sky location, 5σ detection
distances presented here (estimated for one week of data) to the 3.3σ detection distances from
previous seedless clustering work (estimated for a single 288 s spectrogram with a smaller
150 Hz-wide band). Despite the fact that the results here include an additional
$4.6 \times 10^{8}$ trial factor, the hierarchical 5σ detection distances for Monte Carlo noise
are a factor of 0.45-0.82 times the 3σ detection distances obtained in that work using seedless
clustering. For recolored noise the ratios are 0.50-0.96. The event rate and loudness of
long-lived transients are unknown, but the sensitivity we demonstrate here is sufficient to
potentially detect long-lived signals with second-generation detectors.

For illustrative purposes, we focus here on the detection of >5σ events with minimal resources.
As a result, the aggressive auto-power cut eliminates some fraction of <5σ signals that could be
detected with a different tuning. However, the algorithm can be tuned differently to balance
cost with sensitivity for different FAPs. For future work, it is worthwhile considering if this
strategy can be usefully employed in other situations where the character of the noise is less
well-behaved. Moreover, the general strategy we outline here (splitting a search into a
computationally expensive “incoherent” step followed by a computationally cheap “coherent” step
in order to facilitate rapid background estimation) may find use in the broader community. For
example, a similar scheme could prove useful in determining the significance of potentially
faint electromagnetic signatures found (by $\geq 2$ telescopes) in coincidence with
gravitational-wave detections.

We thank Anthony Piro for sharing the fallback accretion waveforms used in this analysis. We
thank Peter Shawhan and Jonah Kanner for helpful comments.